

<!DOCTYPE html>
<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]-->
<head>
  <meta charset="utf-8">
  
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  
  <title>2.5 Searching and Replacing Text &mdash; python3-cookbook 3.0.0 documentation</title>
  

  
  
  
  

  

  
  
    

  

  <link rel="stylesheet" href="../_static/css/theme.css" type="text/css" />
  <link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
    <link rel="index" title="Index" href="../genindex.html" />
    <link rel="search" title="Search" href="../search.html" />
    <link rel="next" title="2.6 Case-Insensitive Search and Replace" href="p06_search_replace_case_insensitive.html" />
    <link rel="prev" title="2.4 Matching and Searching for Text" href="p04_match_and_search_text.html" /> 

  
  <script src="../_static/js/modernizr.min.js"></script>

</head>

<body class="wy-body-for-nav">

   
  <div class="wy-grid-for-nav">

    
    <nav data-toggle="wy-nav-shift" class="wy-nav-side">
      <div class="wy-side-scroll">
        <div class="wy-side-nav-search">
          

          
            <a href="../index.html" class="icon icon-home"> python3-cookbook
          

          
          </a>

          
            
            
              <div class="version">
                3.0
              </div>
            
          

          
<div role="search">
  <form id="rtd-search-form" class="wy-form" action="../search.html" method="get">
    <input type="text" name="q" placeholder="Search docs" />
    <input type="hidden" name="check_keywords" value="yes" />
    <input type="hidden" name="area" value="default" />
  </form>
</div>

          
        </div>

        <div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="main navigation">
          
            
            
              
            
            
              <ul class="current">
<li class="toctree-l1"><a class="reference internal" href="../chapters/p01_data_structures_algorithms.html">Chapter 1: Data Structures and Algorithms</a></li>
<li class="toctree-l1 current"><a class="reference internal" href="../chapters/p02_strings_and_text.html">Chapter 2: Strings and Text</a><ul class="current">
<li class="toctree-l2"><a class="reference internal" href="p01_split_string_on_multiple_delimiters.html">2.1 Splitting Strings on Multiple Delimiters</a></li>
<li class="toctree-l2"><a class="reference internal" href="p02_match_text_at_start_end.html">2.2 Matching Text at the Start or End of a String</a></li>
<li class="toctree-l2"><a class="reference internal" href="p03_match_strings_with_shell_wildcard.html">2.3 Matching Strings with Shell Wildcards</a></li>
<li class="toctree-l2"><a class="reference internal" href="p04_match_and_search_text.html">2.4 Matching and Searching for Text</a></li>
<li class="toctree-l2 current"><a class="current reference internal" href="#">2.5 Searching and Replacing Text</a><ul>
<li class="toctree-l3"><a class="reference internal" href="#id2">Problem</a></li>
<li class="toctree-l3"><a class="reference internal" href="#id3">Solution</a></li>
<li class="toctree-l3"><a class="reference internal" href="#id4">Discussion</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="p06_search_replace_case_insensitive.html">2.6 Case-Insensitive Search and Replace</a></li>
<li class="toctree-l2"><a class="reference internal" href="p07_specify_regexp_for_shortest_match.html">2.7 Shortest-Match Patterns</a></li>
<li class="toctree-l2"><a class="reference internal" href="p08_regexp_for_multiline_partterns.html">2.8 Multiline Patterns</a></li>
<li class="toctree-l2"><a class="reference internal" href="p09_normalize_unicode_text_to_regexp.html">2.9 Normalizing Unicode Text</a></li>
<li class="toctree-l2"><a class="reference internal" href="p10_work_with_unicode_in_regexp.html">2.10 Working with Unicode in Regular Expressions</a></li>
<li class="toctree-l2"><a class="reference internal" href="p11_strip_unwanted_characters.html">2.11 Stripping Unwanted Characters from Strings</a></li>
<li class="toctree-l2"><a class="reference internal" href="p12_sanitizing_clean_up_text.html">2.12 Sanitizing and Cleaning Up Text</a></li>
<li class="toctree-l2"><a class="reference internal" href="p13_aligning_text_strings.html">2.13 Aligning Text Strings</a></li>
<li class="toctree-l2"><a class="reference internal" href="p14_combine_and_concatenate_strings.html">2.14 Combining and Concatenating Strings</a></li>
<li class="toctree-l2"><a class="reference internal" href="p15_interpolating_variables_in_strings.html">2.15 Interpolating Variables in Strings</a></li>
<li class="toctree-l2"><a class="reference internal" href="p16_reformat_text_to_fixed_number_columns.html">2.16 Reformatting Text to a Fixed Column Width</a></li>
<li class="toctree-l2"><a class="reference internal" href="p17_handle_html_xml_in_text.html">2.17 Handling HTML and XML in Text</a></li>
<li class="toctree-l2"><a class="reference internal" href="p18_tokenizing_text.html">2.18 Tokenizing Text</a></li>
<li class="toctree-l2"><a class="reference internal" href="p19_writing_recursive_descent_parser.html">2.19 Writing a Simple Recursive Descent Parser</a></li>
<li class="toctree-l2"><a class="reference internal" href="p20_perform_text_operations_on_byte_string.html">2.20 Performing Text Operations on Byte Strings</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../aboutme.html">About</a></li>
</ul>

            
          
        </div>
      </div>
    </nav>

    <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap">

      
      <nav class="wy-nav-top" aria-label="top navigation">
        
          <i data-toggle="wy-nav-top" class="fa fa-bars"></i>
          <a href="../index.html">python3-cookbook</a>
        
      </nav>


      <div class="wy-nav-content">
        
        <div class="rst-content">
        
          















<div role="navigation" aria-label="breadcrumbs navigation">

  <ul class="wy-breadcrumbs">
    
      <li><a href="../index.html">Docs</a> &raquo;</li>
        
          <li><a href="../chapters/p02_strings_and_text.html">Chapter 2: Strings and Text</a> &raquo;</li>
        
      <li>2.5 Searching and Replacing Text</li>
    
    
      <li class="wy-breadcrumbs-aside">
        
            
            <a href="../_sources/c02/p05_search_and_replace_text.rst.txt" rel="nofollow"> View page source</a>
          
        
      </li>
    
  </ul>

  
  <hr/>
</div>
          <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
           <div itemprop="articleBody">
            
  <div class="section" id="id1">
<h1>2.5 Searching and Replacing Text<a class="headerlink" href="#id1" title="Permalink to this headline">¶</a></h1>
<div class="section" id="id2">
<h2>Problem<a class="headerlink" href="#id2" title="Permalink to this headline">¶</a></h2>
<p>You want to search for a text pattern in a string and replace it.</p>
</div>
<div class="section" id="id3">
<h2>Solution<a class="headerlink" href="#id3" title="Permalink to this headline">¶</a></h2>
<p>For simple literal patterns, use the <code class="docutils literal notranslate"><span class="pre">str.replace()</span></code> method. For example:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">text</span> <span class="o">=</span> <span class="s1">&#39;yeah, but no, but yeah, but no, but yeah&#39;</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">text</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="s1">&#39;yeah&#39;</span><span class="p">,</span> <span class="s1">&#39;yep&#39;</span><span class="p">)</span>
<span class="go">&#39;yep, but no, but yep, but no, but yep&#39;</span>
<span class="go">&gt;&gt;&gt;</span>
</pre></div>
</div>
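<p>As a small aside, <code class="docutils literal notranslate"><span class="pre">str.replace()</span></code> returns a new string rather than modifying the original, and it accepts an optional third argument that limits how many occurrences are replaced. The sketch below illustrates both points; the names <code class="docutils literal notranslate"><span class="pre">all_replaced</span></code> and <code class="docutils literal notranslate"><span class="pre">first_only</span></code> are made up for this illustration.</p>

```python
text = 'yeah, but no, but yeah, but no, but yeah'

# Replace every occurrence. Strings are immutable, so `text` is unchanged.
all_replaced = text.replace('yeah', 'yep')

# The optional third argument caps the number of replacements,
# counting from the left.
first_only = text.replace('yeah', 'yep', 1)

print(all_replaced)
print(first_only)
print(text)
```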
<p>For more complicated patterns, use the <code class="docutils literal notranslate"><span class="pre">sub()</span></code> function in the <code class="docutils literal notranslate"><span class="pre">re</span></code> module.
To illustrate, suppose you want to rewrite dates of the form <code class="docutils literal notranslate"><span class="pre">11/27/2012</span></code> as <code class="docutils literal notranslate"><span class="pre">2012-11-27</span></code>. Here is an example:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">text</span> <span class="o">=</span> <span class="s1">&#39;Today is 11/27/2012. PyCon starts 3/13/2013.&#39;</span>
<span class="gp">&gt;&gt;&gt; </span><span class="kn">import</span> <span class="nn">re</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">re</span><span class="o">.</span><span class="n">sub</span><span class="p">(</span><span class="sa">r</span><span class="s1">&#39;(\d+)/(\d+)/(\d+)&#39;</span><span class="p">,</span> <span class="sa">r</span><span class="s1">&#39;\3-\1-\2&#39;</span><span class="p">,</span> <span class="n">text</span><span class="p">)</span>
<span class="go">&#39;Today is 2012-11-27. PyCon starts 2013-3-13.&#39;</span>
<span class="go">&gt;&gt;&gt;</span>
</pre></div>
</div>
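<p>The first argument to <code class="docutils literal notranslate"><span class="pre">sub()</span></code> is the pattern to match, and the second argument is the replacement pattern. Backslashed digits such as <code class="docutils literal notranslate"><span class="pre">\3</span></code> refer to the numbered capture groups in the pattern.</p>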
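<p>To make the group numbering concrete, the following sketch inspects the groups captured for the first date before performing the substitution. It reuses the pattern and sample text from the example above; the variable names are illustrative.</p>

```python
import re

text = 'Today is 11/27/2012. PyCon starts 3/13/2013.'
pattern = r'(\d+)/(\d+)/(\d+)'

# Groups are numbered left to right by their opening parenthesis:
# group 1 is the month, group 2 the day, group 3 the year.
first_match = re.search(pattern, text)
first_groups = first_match.groups()

# \3-\1-\2 therefore reorders each match as year-month-day.
result = re.sub(pattern, r'\3-\1-\2', text)

print(first_groups)
print(result)
```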
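<p>If you are going to perform repeated substitutions with the same pattern, consider compiling it first for better performance. For example:</p>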
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="kn">import</span> <span class="nn">re</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">datepat</span> <span class="o">=</span> <span class="n">re</span><span class="o">.</span><span class="n">compile</span><span class="p">(</span><span class="sa">r</span><span class="s1">&#39;(\d+)/(\d+)/(\d+)&#39;</span><span class="p">)</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">datepat</span><span class="o">.</span><span class="n">sub</span><span class="p">(</span><span class="sa">r</span><span class="s1">&#39;\3-\1-\2&#39;</span><span class="p">,</span> <span class="n">text</span><span class="p">)</span>
<span class="go">&#39;Today is 2012-11-27. PyCon starts 2013-3-13.&#39;</span>
<span class="go">&gt;&gt;&gt;</span>
</pre></div>
</div>
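<p>A compiled pattern is most useful when the same substitution is applied to many strings. The sketch below assumes a small, made-up list of lines and applies one compiled pattern to each of them.</p>

```python
import re

# Compile once, then reuse the pattern object for every line.
datepat = re.compile(r'(\d+)/(\d+)/(\d+)')

# Hypothetical input lines, invented for this illustration.
lines = ['Start: 1/2/2020', 'End: 12/31/2021', 'No date here']

# Lines without a match are returned unchanged.
converted = [datepat.sub(r'\3-\1-\2', line) for line in lines]

print(converted)
```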
<p>For more complicated substitutions, you can pass a substitution callback function instead. For example:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="kn">from</span> <span class="nn">calendar</span> <span class="kn">import</span> <span class="n">month_abbr</span>
<span class="gp">&gt;&gt;&gt; </span><span class="k">def</span> <span class="nf">change_date</span><span class="p">(</span><span class="n">m</span><span class="p">):</span>
<span class="gp">... </span>    <span class="n">mon_name</span> <span class="o">=</span> <span class="n">month_abbr</span><span class="p">[</span><span class="nb">int</span><span class="p">(</span><span class="n">m</span><span class="o">.</span><span class="n">group</span><span class="p">(</span><span class="mi">1</span><span class="p">))]</span>
<span class="gp">... </span>    <span class="k">return</span> <span class="s1">&#39;{} {} {}&#39;</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">m</span><span class="o">.</span><span class="n">group</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">mon_name</span><span class="p">,</span> <span class="n">m</span><span class="o">.</span><span class="n">group</span><span class="p">(</span><span class="mi">3</span><span class="p">))</span>
<span class="gp">...</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">datepat</span><span class="o">.</span><span class="n">sub</span><span class="p">(</span><span class="n">change_date</span><span class="p">,</span> <span class="n">text</span><span class="p">)</span>
<span class="go">&#39;Today is 27 Nov 2012. PyCon starts 13 Mar 2013.&#39;</span>
<span class="go">&gt;&gt;&gt;</span>
</pre></div>
</div>
<p>The argument to a substitution callback is a <code class="docutils literal notranslate"><span class="pre">match</span></code> object, as returned by <code class="docutils literal notranslate"><span class="pre">match()</span></code> or <code class="docutils literal notranslate"><span class="pre">search()</span></code>.
Use the <code class="docutils literal notranslate"><span class="pre">group()</span></code> method to extract specific parts of the match. The callback should return the replacement text.</p>
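<p>A callback can also do work that a plain replacement pattern cannot, such as converting the captured text to integers and zero-padding it. The following is a minimal sketch under that assumption; <code class="docutils literal notranslate"><span class="pre">to_iso</span></code> is a hypothetical helper name, not part of the recipe.</p>

```python
import re

datepat = re.compile(r'(\d+)/(\d+)/(\d+)')

def to_iso(m):
    # m is the match object for one date; groups are month, day, year.
    month, day, year = (int(g) for g in m.groups())
    # Zero-pad month and day so every date has the same width.
    return '{:04d}-{:02d}-{:02d}'.format(year, month, day)

text = 'Today is 11/27/2012. PyCon starts 3/13/2013.'
iso_text = datepat.sub(to_iso, text)

print(iso_text)
```

<p>Unlike the <code class="docutils literal notranslate"><span class="pre">\3-\1-\2</span></code> replacement, this version turns <code class="docutils literal notranslate"><span class="pre">3/13/2013</span></code> into <code class="docutils literal notranslate"><span class="pre">2013-03-13</span></code>.</p>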
<p>If you want to know how many substitutions were made in addition to getting the replacement text, use <code class="docutils literal notranslate"><span class="pre">re.subn()</span></code> instead. For example:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">newtext</span><span class="p">,</span> <span class="n">n</span> <span class="o">=</span> <span class="n">datepat</span><span class="o">.</span><span class="n">subn</span><span class="p">(</span><span class="sa">r</span><span class="s1">&#39;\3-\1-\2&#39;</span><span class="p">,</span> <span class="n">text</span><span class="p">)</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">newtext</span>
<span class="go">&#39;Today is 2012-11-27. PyCon starts 2013-3-13.&#39;</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">n</span>
<span class="go">2</span>
<span class="go">&gt;&gt;&gt;</span>
</pre></div>
</div>
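<p>One possible use of the count is to tell whether a string was changed at all. The sketch below wraps <code class="docutils literal notranslate"><span class="pre">subn()</span></code> in a hypothetical helper, <code class="docutils literal notranslate"><span class="pre">convert_dates</span></code>, that reports this as a boolean.</p>

```python
import re

datepat = re.compile(r'(\d+)/(\d+)/(\d+)')

def convert_dates(line):
    # subn() returns a (new_string, number_of_substitutions) tuple.
    new_line, n = datepat.subn(r'\3-\1-\2', line)
    # A count of zero means the pattern never matched.
    return new_line, n != 0

changed_line, changed = convert_dates('PyCon starts 3/13/2013.')
same_line, unchanged = convert_dates('No dates in this line.')

print(changed_line, changed)
print(same_line, unchanged)
```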
</div>
<div class="section" id="id4">
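<h2>Discussion<a class="headerlink" href="#id4" title="Permalink to this headline">¶</a></h2>
<p>There is not much more to regular expression search and replace than the <code class="docutils literal notranslate"><span class="pre">sub()</span></code> method shown above.
The hardest part is writing the regular expression pattern itself, which is best left as an exercise for the reader.</p>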
</div>
</div>


           </div>
           
          </div>
          <footer>
  
    <div class="rst-footer-buttons" role="navigation" aria-label="footer navigation">
      
        <a href="p06_search_replace_case_insensitive.html" class="btn btn-neutral float-right" title="2.6 Case-Insensitive Search and Replace" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right"></span></a>
      
      
        <a href="p04_match_and_search_text.html" class="btn btn-neutral" title="2.4 Matching and Searching for Text" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left"></span> Previous</a>
      
    </div>
  

  <hr/>

  <div role="contentinfo">
    <p>
        &copy; Copyright 2017, 熊能.

    </p>
  </div>
  Built with <a href="http://sphinx-doc.org/">Sphinx</a> using a <a href="https://github.com/rtfd/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>. 

</footer>

        </div>
      </div>

    </section>

  </div>
  


  

    <script type="text/javascript">
        var DOCUMENTATION_OPTIONS = {
            URL_ROOT:'../',
            VERSION:'3.0.0',
            LANGUAGE:'None',
            COLLAPSE_INDEX:false,
            FILE_SUFFIX:'.html',
            HAS_SOURCE:  true,
            SOURCELINK_SUFFIX: '.txt'
        };
    </script>
      <script type="text/javascript" src="../_static/jquery.js"></script>
      <script type="text/javascript" src="../_static/underscore.js"></script>
      <script type="text/javascript" src="../_static/doctools.js"></script>

  

  <script type="text/javascript" src="../_static/js/theme.js"></script>

  <script type="text/javascript">
      jQuery(function () {
          SphinxRtdTheme.Navigation.enable(true);
      });
  </script> 

</body>
</html>